Distilling dialogues - A method using natural dialogue corpora for dialogue systems development

Authors

  • Arne Jönsson
  • Nils Dahlbäck
Abstract

We report on a method for utilising corpora collected in natural settings. It is based on distilling (re-writing) natural dialogues to elicit the type of dialogue that would occur if one of the dialogue participants was a computer instead of a human. The method is a complement to other means such as Wizard of Oz-studies and un-distilled natural dialogues. We present the distilling method and guidelines for distillation. We also illustrate how the method affects a corpus of dialogues and discuss the pros and cons of the three approaches in different phases of dialogue systems development.

Similar resources

Using the Process of Distilling Dialogues to Understand Dialogue Systems

Distilled dialogues, i.e. re-written natural dialogues, are a useful complement to dialogues collected in Wizard of Oz-experiments or in natural settings for development of dialogue systems. However, the distillation process itself also provides insights on human-computer interaction and on properties of dialogue systems. In this paper we present the distillation process, including how the guid...

The Negochat Corpus of Human-agent Negotiation Dialogues

Annotated in-domain corpora are crucial to the successful development of dialogue systems of automated agents, and in particular for developing natural language understanding (NLU) components of such systems. Unfortunately, such important resources are scarce. In this work, we introduce an annotated natural language human-agent dialogue corpus in the negotiation domain. The corpus was collected...

Automatic annotation of context and speech acts for dialogue corpora

Richly annotated dialogue corpora are essential for new research directions in statistical learning approaches to dialogue management, context-sensitive interpretation, and context-sensitive speech recognition. In particular, large dialogue corpora annotated with contextual information and speech acts are urgently required. We explore how existing dialogue corpora (usually consisting of utteranc...

Correlations between dialogue acts and learning in spoken tutoring dialogues

We examine correlations between dialogue behaviors and learning in tutoring, using two corpora of spoken tutoring dialogues: a human-human corpus and a human-computer corpus. To formalize the notion of dialogue behavior, we manually annotate our data using a tagset of student and tutor dialogue acts relative to the tutoring domain. A unigram analysis of our annotated data shows that student lea...

Automatic analysis of real dialogues and generating of training corpora

The development of computerized information retrieval dialogue systems communicating with the user in natural language requires the implementation of an effective training procedure with the aid of which the main modules of the dialogue system can be partly automatically developed. The presented paper describes an attempt to create the generating sentence templates automatically, using a specia...

Publication date: 2000